Efficient Matrix-Encoded Grammars and Low Latency Parallelization Strategies for CYK
Abstract
We present a matrix encoding of context-free grammars, motivated by hardware-level efficiency considerations. We find efficiency gains of 2.5–9× for exhaustive inference and approximately 2× for pruned inference, resulting in high-accuracy parsing at over 20 sentences per second. Our grammar encoding allows fine-grained parallelism during chart cell population; we present a controlled study of several methods of parallel parsing, and find near-optimal latency reductions as core count increases.
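The abstract does not spell out the encoding itself, so the following is only a minimal illustrative sketch of the general idea, not the authors' method. It assumes a grammar in Chomsky normal form and stores the binary rules A -> B C in a boolean matrix with one row per parent and one column per ordered child pair. All names (`encode_grammar`, `populate_cell`, `cyk_recognize`) and the toy grammar are hypothetical.

```python
from itertools import product


def encode_grammar(nonterminals, binary_rules, lexical_rules):
    """Encode binary rules A -> B C as a V x (V*V) boolean matrix.

    Row = parent index, column = left_index * V + right_index.
    This layout is an assumption made for illustration only.
    """
    index = {nt: i for i, nt in enumerate(nonterminals)}
    v = len(nonterminals)
    matrix = [[False] * (v * v) for _ in range(v)]
    for parent, left, right in binary_rules:
        matrix[index[parent]][index[left] * v + index[right]] = True
    lexicon = {}
    for parent, word in lexical_rules:
        lexicon.setdefault(word, set()).add(index[parent])
    return index, matrix, lexicon


def populate_cell(matrix, left_cell, right_cell):
    """Boolean matrix-vector product for one split point.

    Each output row depends only on its own matrix row, so rows could be
    computed independently (the kind of fine-grained parallelism the
    abstract mentions); here they are computed sequentially.
    """
    # product() iterates left-major, matching column = left * V + right.
    pairs = [l and r for l, r in product(left_cell, right_cell)]
    return [any(g and p for g, p in zip(row, pairs)) for row in matrix]


def cyk_recognize(words, nonterminals, binary_rules, lexical_rules, start):
    """Return True if `words` can be derived from `start`."""
    index, matrix, lexicon = encode_grammar(
        nonterminals, binary_rules, lexical_rules)
    v, n = len(nonterminals), len(words)
    if n == 0:
        return False
    chart = {}
    for i, word in enumerate(words):
        tags = lexicon.get(word, set())
        chart[i, i + 1] = [k in tags for k in range(v)]
    for span in range(2, n + 1):
        for i in range(n - span + 1):
            j = i + span
            cell = [False] * v
            for k in range(i + 1, j):  # every split point of (i, j)
                update = populate_cell(matrix, chart[i, k], chart[k, j])
                cell = [a or b for a, b in zip(cell, update)]
            chart[i, j] = cell
    return chart[0, n][index[start]]


# Toy grammar, invented for this example.
NONTERMINALS = ["S", "NP", "VP", "Det", "N", "V"]
BINARY = [("S", "NP", "VP"), ("NP", "Det", "N"), ("VP", "V", "NP")]
LEXICAL = [("Det", "the"), ("N", "dog"), ("N", "cat"), ("V", "sees")]
```

With this toy grammar, `cyk_recognize("the dog sees the cat".split(), NONTERMINALS, BINARY, LEXICAL, "S")` accepts the sentence, while a word sequence with no derivation from `S` is rejected. A probabilistic variant would replace the boolean `and`/`any` with a product and a max over rule and child scores.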
Related resources
Iterative CKY Parsing for Probabilistic Context-Free Grammars
This paper presents an iterative CKY parsing algorithm for probabilistic context-free grammars (PCFG). This algorithm enables us to prune unnecessary edges produced during parsing, which results in more efficient parsing. Since pruning is done by using the edge’s inside Viterbi probability and the upper bound of the outside Viterbi probability, this algorithm guarantees to output the exact Viter...
To CNF or not to CNF? An Efficient Yet Presentable Version of the CYK Algorithm
The most familiar algorithm to decide the membership problem for context-free grammars is the one by Cocke, Younger and Kasami (CYK) using grammars in Chomsky normal form (CNF). We propose to teach a simple modification of the CYK algorithm that uses grammars in a much less restrictive binary normal form (2NF) and two precomputations: the set of nullable nonterminals and the inverse of the unit...
Principled Parsing for Indentation-Sensitive Languages
Many languages, such as Haskell, Python, and F#, use the indentation and layout of code as part of their syntax. Because context-free grammars are not able to express these layout rules, existing parsers use ad hoc techniques to handle them. These techniques tend to be low-level and operational in nature, and thus forgo the advantages of more declarative specifications like context-free grammar...
Efficient Implementation of the CKY Algorithm
When the CKY algorithm is presented in Natural Language Processing literature, it is often described in high-level pseudocode. The implementation details of the CKY algorithm, despite being critical to efficiency, are rarely (if ever) discussed. In this paper I discuss multiple implementation approaches, and optimizations on these approaches to reduce parsing time by an order of magnitude wh...
Beam-Width Prediction for Efficient Context-Free Parsing
Efficient decoding for syntactic parsing has become a necessary research area as statistical grammars grow in accuracy and size and as more NLP applications leverage syntactic analyses. We review prior methods for pruning and then present a new framework that unifies their strengths into a single approach. Using a log-linear model, we learn the optimal beam-search pruning parameters for each CY...